Contrast in concept-to-speech generation
Abstract
In concept-to-speech systems, spoken output is generated on the basis of a text that has been produced by the system itself. In such systems, linguistic information from the text generation component may be exploited to achieve a higher prosodic quality of the speech output than can be obtained in a plain text-to-speech system. In this paper we discuss how information from natural language generation can be used to compute prosody in a concept-to-speech system, focusing on the automatic marking of contrastive accents on the basis of information about the preceding discourse. We discuss and compare some formal approaches to this problem and present the results of a small perception experiment that was carried out to test which discourse contexts trigger a preference for contrastive accent, and which do not. Finally, we describe a method for marking contrastive accent in a generic concept-to-speech system called D2S. In D2S, contrastive accent is assigned to generated phrases expressing different aspects of similar events. Unlike in previous approaches, there is no restriction on the kind of entities that may be considered contrastive. This is in line with the observation that, given the 'right' context, any two items may stand in contrast to each other.
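The idea of accenting phrases that express different aspects of similar events can be illustrated with a minimal sketch. This is not the D2S implementation: the `Event` class, the slot names, the "same kind" similarity test, and the use of upper case to show accent are all assumptions made here for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class Event:
    """Hypothetical event representation: an event kind plus slot/value pairs."""
    kind: str
    slots: Dict[str, str]


def contrastive_slots(current: Event, previous: Optional[Event]) -> Set[str]:
    """Return the slots of `current` whose phrases would be marked as contrastive."""
    # Assumption: two events count as "similar" when they are of the same kind.
    if previous is None or previous.kind != current.kind:
        return set()
    shared = current.slots.keys() & previous.slots.keys()
    # Any slot filled differently in the two events is marked; there is no
    # check on what kind of entity the value denotes.
    return {slot for slot in shared if current.slots[slot] != previous.slots[slot]}


def render(current: Event, previous: Optional[Event]) -> str:
    """Join the slot values in order, upper-casing the contrastively accented ones."""
    marked = contrastive_slots(current, previous)
    return " ".join(
        value.upper() if slot in marked else value
        for slot, value in current.slots.items()
    )


first = Event("goal", {"agent": "Smith", "predicate": "scored", "time": "early"})
second = Event("goal", {"agent": "Jones", "predicate": "scored", "time": "late"})
print(render(first, None))     # Smith scored early
print(render(second, first))   # JONES scored LATE
```

Comparing slot values only for inequality, rather than consulting a fixed inventory of contrastable types, is one simple way to mirror the claim that any two items may stand in contrast given the right context.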
Similar resources
An Inquiry into the Dialectical Approach to "Reading"
Purpose: This article tries to explain that reading is a dialectical action. For this purpose, it refers to the concept of dialectics in ancient times and, with a glance at the concepts of man, world, science, language and knowledge, it tries to discuss the dialectical status of reading. Method: In the present article, a conceptual analysis approach has been used. This approach that is used i...
What concept-to-speech can gain for prosody
This article proposes a concept-to-speech system with automated prosody learning based on reinforcement learning. The concept-to-speech system, named Demosthenes, is an extension of the text-to-speech system DreSS. Demosthenes is responsible for template-based text generation and symbolic prosody prediction, while DreSS takes care of acoustic prosody and speech synthesis. The prosody predictor ...
Phonological Mean Length of Utterance in 48-60-Month-old Persian-speaking Children with Isfahani Accent: Comparison of Story Generation and Conversation Samples
Objective: Phonological Mean Length of Utterance (PMLU), a quantitative measure for assessment of phonological skills, has been considered in developmental studies as a diagnostic and clinical criterion in phonological development. Moreover, it is an indicator rate of the efficacy of the intervention. The PMLU is a word level measure that can be calculated on the child’s transcribed speech sampl...
Improving statistical natural concept generation in interlingua-based speech-to-speech translation
Natural concept generation is critical to statistical interlingua-based speech translation performance. To improve maximum-entropy-based concept generation, a set of novel features and algorithms are proposed including features enabling model training on parallel corpora, employment of confidence thresholds and multiple sets of features. The concept generation error rate is reduced by 43%-50% in ...
Use of maximum entropy in natural word generation for statistical concept-based speech-to-speech translation
Our statistical concept-based spoken language translation method consists of three cascaded components: natural language understanding, natural concept generation and natural word generation. In the previous approaches, statistical models are used only in the first two components. In this paper, a novel maximum-entropy-based statistical natural word generation algorithm is proposed that takes i...
Journal:
- Computer Speech & Language
Volume 16
Publication date: 2002